listments. The techniques utilized are Box-Jenkins time
series analysis and linear regression. A combined model
utilizing both techniques is also developed. The ability of
the models to predict is considered adequate for three of
the five ratings and not adequate for two of the ratings.
The regression models utilizing 20-24 year old unemployment
as the only independent variable yielded surprisingly good
predictions, once the time series patterns in the data were
modeled.
I. INTRODUCTION AND REVIEW OF RELEVANT LITERATURE


A. INTRODUCTION


This thesis is an investigation of methods for
predicting the rate of reenlistment in the armed forces,
specifically the Navy. Since the advent of the all volunteer
force in 1973, one of the major concerns of the military has
been the retention of qualified personnel beyond their first
enlistment. Referred to as first term reenlistment, this
decision has been the object of extensive study and modeling
by each of the services. The vast majority of this work has
centered around the formulation of causal models with a
heavy emphasis on economic factors. During periods when
reenlistments have been below the required levels, these
models have been quite good at capturing the effect of the
economic factors used to suggest the level of monetary
compensation. Over the past four years, this situation has
changed to the point where the services are faced with such
large numbers of personnel desiring to remain in the
services that high reenlistment is actually lowering the
numbers of personnel that can be enlisted for some ratings
at the recruiting stations. As always, there is still a need
for more personnel in the nuclear related ratings but some
of the less technical fields are approaching an end point
where only a limited number of billets may be available to
new accessions.

The object of this thesis is to attempt to construct a
forecasting model that could aid in predicting first term
reenlistments for five selected Navy ratings. Initial models
will be developed utilizing the Box-Jenkins method of time
series analysis. These initial models will be used to
predict the level of reenlistment for the selected ratings.
Next, a leading indicator model will be developed utilizing
the national unemployment rate for 20-24 year olds. Then, a
refined forecast will be developed combining both time
series and causal models.

The potential advantage of time series analysis lies in
its simplicity and lack of reliance on external factors. To
many, this is viewed as a shortcoming since it cannot
explain causal relationships such as the effect of
advertising dollars on sales for a company. However, the
effectiveness of advertising dollars is not easy to
determine and, in many cases, may lead to false conclusions
when used in classical regression analysis. Other errors
such as the autocorrelation of an independent variable or
the multicollinearity of several variables are also avoided
in the use of time series analysis. Since time series
analysis relies on the ability of a series to reproduce
itself over time, it is free of these errors of regression
and still retains the ability to adequately forecast events,
or in this case levels of reenlistment. A more thorough
explanation of the Box-Jenkins method is presented in
Appendix B.

If a time series model is accurate at predicting changes
in trends as well as levels of reenlistment, then it may
also be useful as a tool to adjust the levels of
reenlistment bonuses to the most cost effective level
necessary to retain the desired force levels. If a model can
accurately foresee a significant increase in reenlistments,
independent of the reenlistment bonus, the bonus level for
that rating can be scaled down appropriately to retain
personnel without the payment of an economic rent. (Rent, in
this case, is the payment of a bonus for a decision
independent of the bonus award level.) To summarize, time
series analysis may not prove to be a panacea in forecasting
reenlistment rates but it most definitely deserves wider
consideration and application.
B. REVIEW OF RELEVANT LITERATURE IN THE FIELD

1. Overview

To the best of the author's knowledge, the application
of Box-Jenkins time series analysis to reenlistment
models has not been previously attempted. A search
conducted by the Defense Technical Information Center
(DTIC), utilizing both title and subject, did not reveal any
references to the use of Box-Jenkins modeling for
reenlistment. Bepko [Ref. 1], in his thesis modeling career
petty officers, reviews the relevant literature concerning
first term reenlistment and career reenlistment from 1974 to
the publication of his thesis in 1981. This review will
address relevant publications since that time.


2. The Annualized Cost of Leaving (ACOL) Model


This model was developed for the Navy by John Warner
at the Center for Naval Analysis and is currently the most
widely used model in the Navy with relation to manpower and
personnel policy decision making.

The model itself is a sophisticated multiple linear
regression that attempts to capture several important
underlying forces in the reenlistment decision. The general
model is of the form:


$$C(t,n) = \sum_{j=t}^{n} \frac{M(j)}{(1+r)^{j-t}} + \frac{W(n) + R(n)}{(1+r)^{n-t}} - \left[ W(t) + R(t) \right] \qquad \text{(eqn 1.1)}$$

where;

1. C(t,n) = Net present value of pecuniary and non-
   pecuniary returns of staying in the military until
   time 'n' as compared to leaving at time 't'.

2. M(j) = Monetary returns to military service in
   periods 't' through 'n'.

3. W(n) = Lump sum payment of the present value of the
   expected post service civilian wages realized by
   those staying in the military until time 'n'.

4. R(n) = Lump sum payment of the present value of the
   expected retirement benefits realized by those
   staying in the military until time 'n'.

5. W(t) = Present value in year 't' of the expected
   civilian wages realized by those leaving the military
   in year 't'.

6. R(t) = Present value in year 't' of the expected
   civilian retirement payments for those leaving the
   military in year 't'.

7. r = personal discount rate

As can be seen, this model relies on development of
several sub-models relating to civilian wage structure,
future policy decisions concerning lump sum bonus payments
and an individual's personal discount rate over time. While
the results obtained with this model have been superb, the
model itself is complex and somewhat difficult. Since the
introduction of the general model described above, it has
been further refined to include a 'taste' factor for
military service and a random disturbance term which is used
to capture the effects of sea shore rotation, poor duty
stations and family separation. These factors improve the
model but at a cost of ever increasing complexity.
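The definitions listed for the general ACOL form can be turned into a small calculation. The following is only a minimal sketch, assuming the cost of leaving is the discounted military pay stream plus the discounted lump sums for staying, minus the present value of leaving now. The function name, argument names and figures are hypothetical and are not taken from Warner's model.

```python
def cost_of_leaving(military_pay, w_stay, r_stay, w_leave, r_leave, rate):
    """Net present value of staying versus leaving now (illustrative only).

    military_pay: pay in each period from 't' through 'n' (hypothetical values)
    w_stay, r_stay: lump sum civilian wages and retirement realized at time 'n'
    w_leave, r_leave: present value of civilian wages and retirement at time 't'
    rate: personal discount rate per period
    """
    horizon = len(military_pay) - 1  # number of periods between 't' and 'n'
    # Discount each period's pay back to time 't' (the first period is undiscounted).
    pay_stream = sum(pay / (1 + rate) ** j for j, pay in enumerate(military_pay))
    # Discount the lump sums received at time 'n' back to time 't'.
    stay_lump = (w_stay + r_stay) / (1 + rate) ** horizon
    return pay_stream + stay_lump - (w_leave + r_leave)
```

A positive value would favor staying under these assumptions; the 'taste' factor and the random disturbance term mentioned in the text are not modeled here.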


3. Darling's Marine Corps Model


Darling [Ref. 2] utilized a combination of
Box-Jenkins time series analysis and multiple linear
regression to predict the supply of upper mental category
recruits to the Marine Corps. The procedure entailed
development of three separate models using two distinct
techniques. The first was a standard multiple regression of
the logit form:


[equation illegible in source]                    (eqn 1.2)
where;

1. S = The supply of military recruits
2. M = The military wage
3. a = The stochastic error term

In this phase of development, the following variables were
introduced:

1. Civilian military pay ratio

2. National unemployment for 16-19 year old males

3. Monthly leads from print media

4. Number of Marine recruiters

5. Dummy variable to account for anomalies in a particular
   time period.

Further analysis led to inclusion of items one, two and five
from the above list in the final regression model.

Next, a totally independent model based on
Box-Jenkins methodology was developed for Marine accessions.
This was a univariate model whose purpose was to capture the
effect of seasonality that was missed by the regression
model. The Box-Jenkins method yielded a model of the form:


[equation illegible in source]                    (eqn 1.3)


where;

1. Y = The number of high school graduates in mental
   categories I and II that enlist in the Marine Corps in
   month t.

2. φ = The autoregressive coefficient that describes the
   model

3. θ = The moving average coefficient that describes the
   model

4. e = The error term of the model.

These two models were initially used to generate
independent forecasts of the actual enlistment supply. Both
models were adequate in capturing trends in the actual
enlistment rate but were somewhat deficient in the actual
numbers generated. At this point, Darling combined the
techniques by utilizing the Box-Jenkins method to take
advantage of the high degree of serial correlation remaining
in his regression model's residuals by modeling the
residuals as a separate time series. The result of this
method was to predict 'error' terms to be applied to the
results of the regression equation. This combined model was
summarized by eqn 1.4 shown below;


$$LSV_{c} = LSV_{mr} + e_{bj} \qquad \text{(eqn 1.4)}$$

where;

1. $LSV_{c}$ = The predicted value of the combined model

2. $LSV_{mr}$ = The predicted value using the regression
   model only

3. $e_{bj}$ = The Box-Jenkins model of the residuals of the
   regression model

The resulting combined forecasts were substantially
better than either of the techniques could achieve
separately and were extremely accurate in capturing the
general trend of the enlistment supply.
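The combination Darling describes, a regression forecast corrected by a time series forecast of the regression residuals, can be sketched in a few lines. This is a minimal illustration only: it assumes a single regressor and stands in a first order autoregression for the Box-Jenkins residual model, which is a simplification and not Darling's specification. All function names are hypothetical.

```python
def ols_fit(x, y):
    """Least squares intercept and slope for one regressor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def ar1_coefficient(series):
    """Least squares AR(1) coefficient for a zero-mean series (assumed stand-in)."""
    numerator = sum(a * b for a, b in zip(series[1:], series[:-1]))
    denominator = sum(b * b for b in series[:-1])
    return numerator / denominator

def combined_forecast(x, y, x_next):
    """Return (regression part, predicted residual, combined one-step forecast)."""
    intercept, slope = ols_fit(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    phi = ar1_coefficient(residuals)
    regression_part = intercept + slope * x_next
    residual_part = phi * residuals[-1]  # predicted 'error' term for the next period
    return regression_part, residual_part, regression_part + residual_part
```

The residual series must not be identically zero, otherwise the AR(1) coefficient is undefined; a perfectly fitting regression leaves nothing for the time series stage to model.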
Bepko [Ref. 1] concentrated his efforts in developing a
multiple regression model for forecasting career retention
beyond the first enlistment. This model was unique in that
it evaluated the Navy rating structure by occupational
groupings and utilized an age specific index for introducing
unemployment into the model. This was the first work to look
at ratings and unemployment together in this manner.

Bepko's [Ref. 1] overall findings were that evaluating
the reenlistment decision by groupings among careerists
yielded more relevant models for the application of
bonus payments and that unemployment among the 25-39 age
group was a very significant factor in the reenlistment
decision of careerists.
5. The Thomas 


Thomas and Liao [Ref. 3] continued
along the same track as Bepko [Ref. 1] in the examination
of the reenlistment rate among careerists, or those
considering their second or subsequent reenlistment. The
difference in this model is the unique grouping
consideration given to the rating structure, where the
ratings are aggregated by the patterns of their past
reenlistment percentages. This aggregation appears to better
capture the effect of the significant variables on
the reenlistment decision. The variables utilized in this
study were national unemployment, the civilian military pay
ratio and tenure as expressed by years of service. The
results of the predictions generated by the regression
equation were excellent and generally less than 13% in total
error.
This chapter presents the analysis of re-enlistment data
by the Box-Jenkins technique. Box-Jenkins analysis involves
three steps;

1. Identification - This involves analysis of the time
   series plots of the raw data to try and discern any
   obvious trend or seasonality

2. Estimate - This step involves analysis of the
   autocorrelations and partial autocorrelations to provide
   an estimate for an initial model

3. Forecast - This step involves running the models and
   generating predictions which are then evaluated for
   their adequacy.

Should any model prove inadequate, the estimation of the
model is re-evaluated to find a more suitable model.


A. INITIAL ANALYSIS OF THE DATA SETS


The Box-Jenkins method was applied to data sets of the
number of first term reenlistees for the following ratings:

1. Yeoman (YN)
2. Storekeeper (SK)
3. Operations Specialist (OS)
4. Electronics Technician (ET)
5. Boiler Technician (BT)

These ratings were selected for analysis because they
presented a representative mix in mental category groupings,
varying degrees of general and specific training and also
provided sufficient numbers of reenlistments to perform
valid analysis.
The data consisted of monthly summaries of first term
reenlistment percentages for the subject ratings covering
the period from October, 1980 through September, 1983. These
data were provided by Mr. James McEwan, the statistician for
the Re-enlistment Programs Development Office (OP-136). A
summary of the data is shown in Table I. Reenlistment
percentages for these ratings ranged from a low of 41.2 per
month to a high of 91.2 per month (ET).¹ It should be
noted here that the time series plots of the raw data
suggest no clear cut trend or seasonality, nor are
the series similar across the ratings.

The time series plots on the following pages represent
the percentages of reenlistments in each of the selected
ratings. The vertical axis is the reenlistment percentage
and the horizontal axis is the time line. The start point
for the time line is October, 1980 and the end point is
September, 1983.


TABLE I
SUMMARY STATISTICS OF REENLISTMENT RATES


RATE    DATES         MEAN           ST.DEV.
YN      10/80-8/83    55.5           11.9
SK      10/80-9/83    49.5           13.2
OS      10/80-9/83    43.6           18.0
ET      10/80-9/83    91.4            6.5
BT      10/80-9/83    [illegible]    14.7


¹These numbers are to be viewed in a relative sense for
their ability to provide sufficient population size for
their respective models and not as any measurement of
need in a particular rating. This is to say that these
percentages do not necessarily measure need in a particular
rating, but rather the percentage of eligibles reenlisting.
The time series plots for each data set are
presented in Fig. 2.1 through Fig. 2.5.
B. EXAMINATION OF AUTOCORRELATIONS AND PARTIAL
AUTOCORRELATIONS


Autocorrelation describes the association between
values of the same variable at different time periods.
Autocorrelation coefficients provide important information
about the structure of a time series. These coefficients
can be used to identify trends and possible seasonality
within the data. [Ref. 4]

Partial autocorrelation is used to identify the extent
of the relationship between current values of a variable
and earlier values of that same variable, while holding the
effects of all other time lags constant. [Ref. 1]
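The two quantities defined above can be computed directly from a series. The sketch below is illustrative only: it uses the usual sample autocorrelation estimator and the Durbin-Levinson recursion for the partial autocorrelations, and it assumes the common large-sample approximation of one standard error as 1/sqrt(n) for the "two standard errors" band used later in this chapter. The function names are hypothetical; the thesis itself relied on Minitab output.

```python
import math

def autocorrelations(series, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def partial_autocorrelations(acf):
    """Durbin-Levinson recursion; acf[0] holds the lag 1 autocorrelation."""
    pacf, prev = [], []
    for k in range(1, len(acf) + 1):
        num = acf[k - 1] - sum(prev[j] * acf[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

def two_standard_error_limit(n):
    """Approximate significance band, assuming a standard error of 1/sqrt(n)."""
    return 2.0 / math.sqrt(n)
```

For a first order autoregressive process the autocorrelations decay geometrically while the partial autocorrelations cut off after lag one, which is the pattern the rating-by-rating discussion looks for.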


1. Yeoman


Examination of the autocorrelations for the Yeoman
rating, Fig. 2.6, suggests that the data is stationary and
would not require any transformation prior to model
analysis. The residuals are within two standard errors of
the mean zero and appear to be randomly distributed. The
partial autocorrelations, Fig. 2.7, decay to zero rapidly
and appear random after this point, suggesting that an
autoregressive model may be appropriate.


2. Storekeeper


Examination of the autocorrelations for this data
set indicated the residuals met the randomness criteria of
two standard errors. The shape of the residuals appeared to
be a decaying sine curve, Fig. 2.8. The partial
autocorrelations showed no shape but dropped to zero
gradually and were randomly distributed, Fig. 2.9, not
suggesting any clear cut model.
Specialist data set again indicated that the residuals were
randomly distributed and strongly suggested the use of some
type of autoregressive operation in model selection.
The data set for the Electronics Technician rating
exhibited a strong trend when the residuals were examined,
Figure 2.11. In addition there was some non-stationarity
suggested, which would require differencing² to remove.
Fig. 2.12 illustrates the shape of the ACF function.
Evaluation of the resultant autocorrelation showed that this
process may not be necessary, as the first lag exceeds -0.5
in magnitude, which is a classic indication of an
overdifferenced data set.
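Differencing and the overdifferencing check mentioned here are simple to express. The following is a minimal sketch, assuming the rule of thumb cited in the text: a lag one autocorrelation of the differenced series at or below -0.5 is treated as a warning sign. The function names and the exact threshold handling are assumptions for illustration.

```python
def difference(series):
    """First difference: each value minus the value before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def lag1_autocorrelation(series):
    """Sample autocorrelation at lag one."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    c0 = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - 1] for t in range(1, n)) / c0

def looks_overdifferenced(series):
    """True when the differenced series has lag one autocorrelation <= -0.5."""
    return lag1_autocorrelation(difference(series)) <= -0.5
```

A series whose differences are all identical has zero variance after differencing, so the check is undefined in that degenerate case.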

The shape of the autocorrelation and the partial
autocorrelation, Fig. 2.13, suggests that an autoregressive
or possibly a mixed model may be appropriate for evaluation.


Figure 2.12 AUTOCORRELATIONS FOR ET RATING.
²The method of differencing converts a non-stationary time
series into a stationary one. It consists of subtracting
successive values from one another and using these
differences as a new time series [Ref. 4].
Figure 2.13 PARTIAL AUTOCORRELATIONS FOR ET RATING.


5. Boiler Technicians


The Boiler Technician rating data set exhibited no
strong trend in the autocorrelations, Fig. 2.14. The shape
of the autocorrelation and partial autocorrelation, Fig.
2.15, also strongly suggested that an autoregressive type of
model should be considered for evaluation. This was
indicated by the decaying sine wave pattern in the ACF and
the abrupt cutoff of the value of the PACF.


C. MODEL DEVELOPMENT 


In developing the models, the autocorrelations and
partial autocorrelations were evaluated against
representative Box-Jenkins models of the autoregressive (AR)
and moving average (MA) type. A best fit model was then
selected for evaluation utilizing the Minitab General
Purpose Statistical Computing System. The results of these
models were then evaluated to ensure the residuals were
random and less than two standard errors of the mean zero.
If the model was satisfactory at this point, the sum of
squared errors (SSE) was evaluated to determine if it was
the lowest possible reduction in the SSE. In the event that
more than one model passed these tests the t-ratios were
then evaluated. Again, if all these tests were
insignificantly different among a family of models the
principle of parsimony was used to select the "best" model.
In every model developed during this phase of analysis, all
three of the criteria were met by only the model selected,
which made the choice of the correct prediction model
relatively easy.

Once these models were selected, forecasts were generated
for the period October 1983 to March 1984 and compared
to the actual reenlistment totals for the period. If a model
was evaluated as totally inappropriate at this point, then
further investigation and modeling was pursued to attempt
resolution of the problem.
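The selection criteria described above (random residuals, smallest SSE, a significant t-ratio) and the forecasting step can be illustrated for the simplest candidate, an ARIMA (1,0,0) model. This is only a sketch under stated assumptions: it fits the model by conditional least squares, which is not necessarily the estimation routine Minitab used, and the function names are hypothetical.

```python
import math

def fit_ar1(series):
    """Conditional least squares fit of y[t] = c + phi * y[t-1] + e[t]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    c = mean_y - phi * mean_x
    residuals = [b - (c + phi * a) for a, b in zip(x, y)]
    sse = sum(e * e for e in residuals)  # sum of squared errors
    std_err = math.sqrt(sse / (n - 2) / sxx)  # standard error of phi
    t_ratio = phi / std_err if std_err > 0 else math.inf
    return {"c": c, "phi": phi, "sse": sse, "t_ratio": t_ratio,
            "residuals": residuals}

def forecast_ar1(c, phi, last_value, steps):
    """Iterate the fitted equation forward to produce multi-step forecasts."""
    forecasts, value = [], last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts
```

Candidate models would then be compared on the SSE and t-ratio returned here, with the residuals checked against the two standard error band before any forecast is accepted.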

Table II presents a summary of the model forecasts for
the selected ratings. Model summaries and statistics are
presented in Appendix C.
1. Yeoman Model


The model selected for the Yeoman rating was of the
ARIMA (1,0,0) type. A comparison of the forecasted
re-enlistment percentages with the actual totals showed an
average error of .106 with a range from a low of
[illegible] to a high of [illegible]. All of the
observations were within the 95% confidence limits of the
model and were also captured in shape by the model.
Fig. 2.16 graphically illustrates the forecasts and
observations.
Figure 2.16 PLOT OF UNIVARIATE MODEL FOR YN RATING.


2. Storekeeper Model


Since the autocorrelations and partial autocorrelations
did not clearly suggest the selection of one model
type as being superior to another, an iterative method was
used for selection. After discarding several first order AR,
MA and ARMA models it was decided to try differencing even
though the data did not strongly suggest that this was
necessary. The resultant ACF and PACF satisfied the
randomness criteria without indicating overdifferencing. A
model of the ARIMA (0,1,1) order was then found to meet all
the necessary stringency requirements for model selection.
The model generated forecasts with an average error of .132
and a range from -.064 to -.194. It is felt that iterative
modeling with the actual data for the forecast period would
reduce this error even further. Fig. 2.17 illustrates the
data fit with the model.


3. Operations Specialist Model 


Modeling of the Operations Specialist rating
resulted in the selection of an ARIMA (1,0,0) as the most
appropriate model. The data for the rating presented the
largest fluctuation in range over the entire data set. These
fluctuations have no doubt influenced the ultimate
predictive power of the selected model. As a result, the
model yielded acceptable predictions varying from the actual
observations by an average of .155 and a range from .06 to
.315. As with the previous model, successive iterations
with the new observations should improve the predictive
power of the model. Fig. 2.18 illustrates the model's fit
with the actual observations.
Figure 2.17 PLOT OF UNIVARIATE MODEL FOR SK RATING.


4. Electronics Technician Model 


A first order autoregressive model of the ARIMA
(1,0,0) type was also selected for the Electronics
Technician rating. The model produced spectacular results
with an average error of -.025 and a range of errors from
[illegible] to -.045. It should be noted that this data set
is also the most stable over time with an average of more
than 91% first term reenlistments. While this makes the
model's job somewhat easier, it still remains a powerful
model for univariate predictions. Fig. 2.19 illustrates the
fitting of the model.
Figure 2.18 PLOT OF UNIVARIATE MODEL FOR OS RATING.


5. Boiler Technician Model 


In the case of the Boiler Technicians, an ARIMA
(1,0,0) was again evaluated as the most appropriate;
however, this model was the poorest predictor of any
selected for evaluation. The average error was .203 with a
range from -.328 to .232. The model also failed to capture
the shape of the actual data, which presented an upward
trend while the model indicated a downward turn. The model
is however acceptable for further analysis and refinement in
the transfer function model. Fig. 2.20 illustrates the
fitting of the data set predictions.
Figure 2.19 PLOT OF UNIVARIATE MODEL FOR ET RATING.


D. SUMMARY 


In this chapter, five acceptable univariate models have been
developed for the respective data sets. These models were in
most cases fairly obvious from the analysis of the
autocorrelations and partial autocorrelations; however, the
process can be fairly time consuming and result in the
pursuit of several "blind alleys" on the way to a workable
model. In succeeding chapters, a transfer function model
utilizing unemployment statistics for the 20-24 age group
will be developed.
Figure 2.20 PLOT OF UNIVARIATE MODEL FOR BT RATING.


III. THE LEADING INDICATOR, REGRESSION AND COMBINED MODELS


A. OVERVIEW


A leading indicator model for reenlistment in the
selected ratings was constructed by first developing a
univariate time series model for unemployment in the 20-24
year old age group and then applying this model to the data
for the selected ratings. The resultant model residuals were
then cross-correlated to establish the location of any time
leads or lags that affect reenlistments. These indicators
could provide an early warning system of shifts in the level
of reenlistment and/or the direction of the trend in
reenlistment. Once developed, the adequacy of the model was
tested by using the coefficient of determination (R-squared)
from the indicated lag/lead. Forecasts for the Oct 83-Mar 84
time period were also generated in this process and the
results compared to the univariate models' forecasts. A
combined model using time series analysis and regression
was also formulated.


B. THE UNEMPLOYMENT MODEL 


The data for unemployment in the 20-24 age bracket was
collected from monthly publications by the Bureau of Labor
Statistics. These figures cover the period from October 1980
through September 1983 and are presented in Appendix A. This
particular age grouping was selected as being the most
appropriate for personnel completing their first enlistment
and facing the reenlistment decision.
1. Methodology


The unemployment time series model was constructed
utilizing the same methodology as applied in the preceding
chapter to reenlistment time series for the selected rating
models. The unemployment data was initially transformed by
computing the relative percentage change from one period to
the next. This was done by subtracting the rate in the
current period from the rate in the preceding period and
dividing the remainder by the rate of the preceding period.
By doing this, it was felt that the resulting model would
better capture responses to changes in the unemployment rate
rather than responses to the overall level. It was further
hypothesized that this would capture any perceptions by the
service member that the job market was improving or
worsening in relation to the demand for a particular skill
[Ref. 5].
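The transformation described above can be written out as a short function. This sketch follows the wording of the text literally (preceding rate minus current rate, divided by the preceding rate), so a rising unemployment rate produces a negative value. Scaling the result by 100 to express it as a percentage is an assumption, as is the function name.

```python
def relative_percentage_change(rates):
    """Relative change between successive periods, as described in the text.

    For each pair of periods: (preceding rate - current rate) / preceding rate,
    multiplied by 100 (the scaling is an assumption made for illustration).
    """
    return [100.0 * (preceding - current) / preceding
            for preceding, current in zip(rates, rates[1:])]
```

For example, hypothetical monthly rates of 10.0, 12.0 and 9.0 yield changes of -20 and 25; the transformed series is one observation shorter than the original.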

The computed change in unemployment time series was
then evaluated for a potential model by screening the
autocorrelation and partial autocorrelation functions. The
data appeared stationary but did not suggest any obvious
model for selection. As a result a trial and error method
was used for model selection. The model iterations were
evaluated for suitability utilizing the same criteria
described in the previous chapter, that is evaluation of
residuals for randomness, smallest sum of squared errors and
a t-ratio in excess of 2.0. Several trials of autoregressive
models, moving average models and mixed autoregressive
moving average models did not yield any positive results.
Therefore, it was decided to difference the data one time
and try the iterative model building process again.

The resulting differenced data set also met the
randomness criteria and did not exhibit any characteristics
of an overdifferenced data set. When evaluated at this
point, the autocorrelation and partial autocorrelation
functions suggested that a moving average model was
appropriate. This model of the ARIMA (0,1,1) type met all of
the necessary selection criteria and was therefore adopted
for use. Model results and specifications are presented in
Fig. 3.1 through
Figure 3.1 TIME SERIES PLOT FOR UNEMPLOYMENT DATA.

Figure 3.2

Figure 3.3 PARTIAL AUTOCORRELATIONS FOR UNEMPLOYMENT DATA.
TABLE III
SUMMARY OF UNEMPLOYMENT MA 1 MODEL


NUMBER   TYPE   ESTIMATE   ST. DEV.      T-RATIO
1        MA 1   0.9858     [illegible]   14.29

DIFFERENCING. 1 REGULAR
RESIDUALS.    SS = 896.9 (BACKFORECASTS EXCLUDED)
              DF = 34    MS = 26.4
Figure 3.4 AUTOCORRELATION OF RESIDUALS FOR UNEMPLOYMENT MODEL.


C. APPLICATION TO THE SELECTED RATINGS DATA SETS


1. Yeoman


When the ARIMA (0,1,1) model was applied to both the
reenlistment data for Yeoman and the differenced
unemployment data set and the cross-correlation function
evaluated, there appeared to be a relationship at the twelve
month lead point. The value at this point was significant
when compared to the other points; however, it was not
significant in absolute terms since it was not in excess of
two standard errors of the mean zero. This observation of
magnitude holds true for all of the models. Fig. 3.5 shows
the cross-correlation function for this data set for all
lead months only.
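The cross-correlation screening described above can be sketched as follows. The example assumes both input series have already been filtered by the same time series model, as the text describes, and it uses the approximate two standard error band of 2/sqrt(n) as the test for absolute significance. The function names are hypothetical and the estimator is the ordinary sample cross-correlation, which may differ in detail from the Minitab routine.

```python
import math

def cross_correlation(leader, follower, lead):
    """Correlation between leader[t] and follower[t + lead]."""
    n = len(leader)
    mean_l, mean_f = sum(leader) / n, sum(follower) / n
    numerator = sum((leader[t] - mean_l) * (follower[t + lead] - mean_f)
                    for t in range(n - lead))
    denominator = math.sqrt(sum((v - mean_l) ** 2 for v in leader)
                            * sum((v - mean_f) ** 2 for v in follower))
    return numerator / denominator

def significant_leads(leader, follower, max_lead):
    """Leads whose cross-correlation exceeds two approximate standard errors."""
    limit = 2.0 / math.sqrt(len(leader))
    values = {k: cross_correlation(leader, follower, k)
              for k in range(max_lead + 1)}
    return {k: r for k, r in values.items() if abs(r) > limit}
```

With 36 monthly observations the band is roughly plus or minus 0.33, which is why lead points that stand out only relative to their neighbors still fail the absolute test in the discussion that follows.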
Figure 3.5 CROSS CORRELATION FOR YN RATING.


2. Storekeepers


The cross-correlation function for the Storekeeper
model indicated a possible lead indicator relationship at
the five month and eleven month points. Both of these points
were significant in their relationship to the other values
but again were below the accepted level for determining
significance. It was decided to investigate the significance
of these points in the regression procedure. Fig. 3.6 shows
the cross-correlation function for positive leads of this
model.
Figure 3.6 CROSS CORRELATION FOR SK RATING.


3. Operations Specialists 


The cross-correlation function again indicated more
than one point for possible investigation as being
relatively significant. These points occurred at the six,
nine and twelve month points. Fig. 3.7 again shows the lead
values for this model.


Figure 3.7 CROSS CORRELATION FOR OS RATING.
4. Electronics Technicians


The cross-correlation function for this model also
suggested more than one lead point for investigation. These
occurred at the six and twelve month points but again were
significant only in relative terms and not in absolute
magnitude. Fig. 3.8 illustrates the lead relationship in the
cross-correlation function.


Figure 3.8 CROSS CORRELATION FOR ET RATING.
5. Boiler Technicians


Consistent with the previous models, the model for
Boiler Technicians also presented more than one prominent
point for evaluation. These points occurred at the nine and
[illegible] month points. Fig. 3.9 is the cross-correlation
function for the lead values of the model.
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Figure 3.9 CORRELATION FOR BT RATING. 
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D. LINEAR REGRESSION 


In order to verify the leading indicator models, a
linear regression will be constructed using 20-24 year old
unemployment as the independent variable and reenlistment
rates for the selected ratings as the dependent variable.
For this process to be valid, certain assumptions are
required. The first assumption is that there is a linear
relationship between the variables as described by eqn. 3.1

$Y = \alpha + \beta X + e$ (eqn 3.1)

where for each observation, Y is a random variable. The
second assumption is that X is fixed in value, and the final
assumption is that e, the error term, has an expected value
of zero with constant variance for all observations. It is
further assumed that the e's are normally distributed and
uncorrelated. [Ref. 6]


E. MODEL VERIFICATION 


Linear regression models were run for all of the data sets
using the lagged value of the unemployment data as the
independent variable and the reenlistment data as the
dependent variable. The regressions were evaluated using the
R-squared value, Durbin-Watson statistic and t-ratio. A
summary of the regression models' R-squared and
Durbin-Watson statistics is presented in Table IV.
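
The three diagnostics named above can be illustrated with a minimal sketch. It is not the software used in this thesis; the function name `lagged_regression` and its arguments are hypothetical, and it computes the ordinary (unadjusted) R-squared, which may differ from the measure reported in Table IV.

```python
from math import sqrt

def lagged_regression(x, y, lag):
    """Regress y[t] on x[t - lag] by ordinary least squares.

    x and y are equal-length lists. Returns intercept, slope,
    R-squared, t-ratio of the slope, and the Durbin-Watson
    statistic of the residuals.
    """
    xs = list(x[:len(x) - lag]) if lag > 0 else list(x)
    ys = list(y[lag:])
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(xs, ys)]
    sse = sum(e * e for e in resid)          # residual sum of squares
    sst = sum((b - my) ** 2 for b in ys)     # total sum of squares
    r_squared = 1.0 - sse / sst
    se_slope = sqrt(sse / (n - 2) / sxx)     # standard error of the slope
    t_ratio = slope / se_slope
    # Durbin-Watson: near 2 means little first-order serial correlation,
    # well below 2 means positive serial correlation in the residuals.
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / sse
    return intercept, slope, r_squared, t_ratio, dw
```

Running the function once for each lag from 1 to 12 and recording R-squared and Durbin-Watson would give a grid with the same layout as Table IV.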

The regression models for all of the ratings were less
than robust in their ability to verify the significance of
the leading indicator models. R-squared values ranged from a
high of .34 for the Yeoman model to a dismal -.02 for the
Boiler Technician model. These values represent the best
R-squared values that were obtained at all lagged values of
the independent variable and not just the ones that were
indicated as being significant in the leading indicator
model. t-ratios were somewhat stronger, ranging from [value
illegible] for Storekeepers to 11.3 for Electronics
Technicians, with three of the five ratings having values
above the minimum acceptance level of 2.0. The Durbin-Watson
statistic was weak for the models as well, with only the
Storekeeper model clearly in the acceptable range of 1.5 to
2.5 for the data. This indicates that the regression failed
to remove all of the serial correlation present in the data
sets and the residuals are positively correlated with one
another.

While these models are disappointing, they are not
discouraging. They seem to indicate that, when taken alone,
unemployment does not possess the strong predictive ability
it seems to have in other econometric models. Bepko
[Ref. 1] and Darling [Ref. 2], in their regression models,
attribute nearly 50 percent of the explained sum of squared
error to unemployment. The present results may indicate that
this relationship does not hold in modeling the behavior of
first term reenlistments for military personnel. It should
be noted that Bepko [Ref. 1] constructed an aggregated model
of careerists using the 25-39 age group for unemployment and
Darling [Ref. 2] utilized national teenage unemployment in
modeling Marine Corps enlistments.

The predictions for the regression model are compared to
the observed levels of reenlistment for the period October
1983 to March 1984; these forecasted levels are shown in
Table VI. In addition, the regression forecast for the
entire data set from October 1980 to September 1983 will be
generated and the residuals for that data set will be
independently modeled as a new time series, with a forecast
of residuals to be applied to the forecasts from the
regression
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TABLE IV 
SUMMARY OF R-SQUARED AND DURBIN-WATSON STATISTICS 


X=UNEMPLOYMENT FOR 20-24 YEAR OLD AGE GROUP 
SELECTED RATINGS: R-SQUARED


LAGS (X) (DURBIN-WATSON STATISTIC IN PARENS)
YN SK OS ET BT
1 .265 222 „027 aan -.03 
(1.54) (1.7) (373 ӨР) EE 
2 ЕЭ 2154 2078 „141 -.03 
(1.56) (1281)  (.83) (1.2) (87 
3 „30 . 199 „025 . 133 -.031 
(1.55) (1-56) “Y. 81) (1.22) (284) 
ü „349 . 163 0087 . 115 - .027 
(1.03) р (1.51) (283) 
5 2218 ‚134 „031 . 041 -.023 
(1.3) {1.76) C81) (1.39) (2814 
6 .259 . 229 - 006 Noe -.032 
(1.31) — (1.26) (ШЕ (1.55) (79) 
7 Ее .253 . 002 .003 -.035 
(1.25) (1:75) ee) (1.6) (281) 
8 „21 2220 . 004 -.009  -.034 
(1.16) 11.6) (. 79) (1.65) (270) 
9 2148 . 183 = Озо -.027 
(1.16) (1.58)  (:81) (208) (S 
10 -171 . 193 -.023 .024 -.027 
(1.14) (1:69) (279) (2.08) (75) 
11 „098 „115 -.013 -.011 -.043 
(1.3) (1.9 7 у (2.08) (2.75) 
12 22 08 Ий -.045 .051 ~.042 
(1.66) (2.24) (271) (2209) (.8) 


models. This procedure should yield forecasts with a better
fit than was possible with only the univariate or regression
models. This procedure closely parallels the work of Darling
[Ref. 2] in modeling the supply of recruits for the Marine
Corps. The combined model will be developed in the next
section.

THE COMBINED MODEL


The regression models in the previous section yielded
adequate forecasts of the reenlistment rate. They were not,
however, significantly better than the forecasts for the
univariate models in chapter three. Additionally, the
residuals of the regression model exhibited strong positive
serial correlation, as indicated by the low values of the
Durbin-Watson statistic for each model. Table IV summarizes
the data for the regression models. As shown by Darling
[Ref. 2], this enables the residuals to be constructed into
an independent time series. Through the application of the
Box-Jenkins method, a forecast of the residuals or error
terms can be generated and applied to the regression models'
forecasts. This procedure should yield a forecast with a
better fit to the actual data.
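
The combination step can be sketched as follows. This is only an illustration under a simplifying assumption: the residual series is modeled as AR(1), whereas the residual models actually selected in this thesis vary by rating. The names `fit_ar1` and `combined_forecast` are hypothetical.

```python
def fit_ar1(series):
    """Least-squares AR(1) coefficient for a (roughly zero-mean) series."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(v * v for v in series[:-1])
    return num / den

def combined_forecast(intercept, slope, x_future, residuals):
    """Regression forecast plus an AR(1) forecast of the residuals.

    x_future holds the (lagged) unemployment values for the forecast
    months; residuals are the in-sample regression errors.
    """
    phi = fit_ar1(residuals)
    last = residuals[-1]
    forecasts = []
    for step, xv in enumerate(x_future, start=1):
        resid_hat = (phi ** step) * last   # h-step-ahead AR(1) forecast
        forecasts.append(intercept + slope * xv + resid_hat)
    return forecasts
```

Because the residual forecast decays toward zero as the horizon grows, the correction matters most for the first few forecast months.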


TABLE V 
REGRESSION MODELS FOR SELECTED RATINGS 


| B "rac ОИ БИШИ БЕТАНИ БЛУ с R-SORD WATSON 
| УМ 4 3205 3292 9.97 . 1.43 
| EK 7 159 3:97 (oe 2032 1. 75 
OS H 73.2 =z22)3 16.32 „037 „81 
| ET 1 552 J. | o uo 1. 22 
| BT 5 E zs 14.93 5:02:59 .84 





The Box-Jenkins methodology as described in Chapter
three was again applied to the sets of regression model
residuals and appropriate models were selected. Table VI is
a summary of the forecasts for all three methods and
percentage of error for each forecast.

The combined model resulted in improved forecasts in
three of the five ratings, with the other two (BT, OS)
showing either minor improvement or a slight decline. It
should be noted at this point that these two data sets were
the most volatile in terms of range of observations. The
forecasts for these ratings could be improved by eliminating
significant outliers from the data sets and recomputing all
of the models. Due to the already small size of the data
sets, this was not considered. This problem should be
corrected in future works in this area as more data points
become available for analysis.

TABLE VI
FORECAST COMPARISONS OF ALL THREE MODELS

IV. SUMMARY AND CONCLUSIONS


A. SUMMARY 


Several forecasting techniques have been examined in
this thesis in an attempt to predict the pattern of
reenlistments in five specific ratings. Two distinct methods
were used to build three models: a univariate Box-Jenkins
model, a linear regression model and a combined regression
and Box-Jenkins model.

The results of each varied in predictive ability, with
the combined model being clearly superior to the other two
(as measured by percent error of the actual observations),
but the results by rating differed sharply within each
model. For electronics technicians all three models were
clearly adequate. This is not surprising since this rating
had the smallest range and the least variance in
reenlistments during the time period examined. The
regression equation for ET's yielded a very low R-squared
value for the model. Appropriate additional explanatory
variables may be the level of reenlistment bonuses or the
availability of advanced technical training. Boiler
technicians and operations specialists showed the widest
range in reenlistment percentages and, as expected, their
models exhibited the least accuracy. The regression
equations for these ratings were counterintuitive in that
they indicated higher reenlistment rates at successively
lower levels of the independent variable, unemployment. This
indicates that an additional independent variable may be
required in the equation for these ratings. For BT's this
may be a dummy variable accounting for the unpleasantness of
the working conditions or the level of their reenlistment
bonuses. For the OS rating, it may also be the reenlistment
bonus level and a factor accounting for the high amount of
sea duty present in that rating when compared to others. The
ratings of yeoman and storekeeper presented models that were
marginal when using Box-Jenkins or regression separately but
presented quite good predictions when utilizing the combined
model.


A somewhat surprising result of the models' forecasts was
the accuracy of the regression model using 20-24 year old
unemployment as the only independent variable. This is
surprising in view of the low R-squared values of the models
and the high degree of serial correlation remaining in the
residuals as expressed by the Durbin-Watson statistic. This
was actually the second most accurate prediction model,
outperforming the univariate Box-Jenkins model by a slight
margin.

The Box-Jenkins models' performance was restricted by
the size of the data set available. Technically, thirty or
more observations in a data set are considered sufficient,
but 100 or more observations are considered desirable in
order to utilize the full predictive power of the model.
This larger number of observations is also considered
desirable in terms of identifying the underlying trends and
patterns which may not appear in a smaller set of data. In
terms of forecasting reenlistments, it is not possible at
this point to utilize any more data points than were
available for this study, since the monthly figures are
aggregated by quarters after three years and only the
quarterly data are retained.

B. CONCLUSIONS


A surprising finding, for all of the ratings modeled, is
the continued rise in reenlistments in view of the ever
improving economy during the period. This could possibly be
explained, in a regression model, with the introduction of
the civilian/military pay ratio for the period or the level
of reenlistment bonuses for a rating. This would still,
however, not account for the 35 percent reenlistment rate
for electronics technicians, who are generally regarded as
having the most desirable and marketable skills in almost
any employment market. Another explanatory term could be
introduced for "taste" for military service, much as the
ACOL model uses. In light of the world situation and recent
events in Lebanon, Grenada and the Persian Gulf, this could
be a significant explanatory factor for the continued rise
in reenlistments.

In terms of policy implications, the results for all of
the models utilized indicate that high levels of first term
retention are likely to continue in all of these ratings for
the next six to twelve months. At this point, decisions will
be required on how to deal with these increases in a service
that is rapidly approaching authorized end strength. The
longer range forecasts still seem to indicate that the
currently favorable climate will eventually give way to an
ever improving economy. Now would seem to be the most
opportune time to take advantage of the situation by
increasing the total number of personnel in the career force
as a hedge against the future change in the demographics of
the cohort eligible for military service. This will induce a
short term increase in compensation costs by increasing the
inventory of career petty officers. This will eventually be
offset by the reduction in future training and recruiting
costs that will result from this larger career force.


C. SUGGESTIONS FOR FUTURE RESEARCH


As previously stated, the full potential of the
Box-Jenkins method has not been fully exploited because of
restrictions in the amount of data available. Further, this
restriction can only be corrected with the passage of time
as more observations become available. The models presented
in this paper were rating-specific, which may only have
limited application. In a broad sense, however, research
should continue along these lines with aggregate models of
rating groups. As to how this aggregation should be
performed, Thomas and Liao [Ref. 3] have suggested grouping
ratings by observed reenlistment behavior in their model of
second and subsequent term careerists. This grouping should
be conducive to application of Box-Jenkins techniques, which
appear to be more effective when dealing with a data set of
narrow range. Another possibility for grouping could follow
the level of skill required, as indicated by a rating being
termed high-tech, medium tech or low tech [Ref. 1].

The combined regression, Box-Jenkins model presented
here also deserves future consideration, as it appears to be
a viable "fine tuning" method for regression models and is
intuitively more appealing than introducing more and more
variables into the analysis. Use of a combined modeling
technique can only serve to strengthen the results of
regression models that are currently very popular.

The Box-Jenkins method is not meant to be an all
encompassing method for use in manpower modeling. It
certainly should, however, be considered as a tool to be
placed in the arsenal of the manpower planner for continued
use and development. In view of the many commercial software
packages available for this technique, implementation and
application to manpower issues should most strongly be
considered for use in Navy manpower planning.

THE BOX-JENKINS METHOD


А. OVERVIEW 


The Box-Jenkins procedure can be used to fit and forecast
time series data by means of a general class of statistical
models. An observation at a given point in time is
modeled as a function of its past values and/or current and
past values of the random errors, both at seasonal and
non-seasonal lags. Box-Jenkins methodology will model a
variable with observations equally spaced in time and no
missing values. Sometimes it may be necessary, before
modeling the series, to transform the data by taking the log
function, square root or power of the series, or to
difference the series on a seasonal or non-seasonal basis.

The modeling of time series data is usually done in
three steps. First, identify a tentative model for the
series. Second, estimate the parameters³ and examine the
diagnostic plots and statistics. Third, if the model is
deemed acceptable, utilize the procedure for forecasting. If
the model is inadequate, return to step one and evaluate the
time series for more appropriate models until an acceptable
one is found. Fig. B.1 illustrates the steps required in
Box-Jenkins analysis.

The advantage gained in using Box-Jenkins analysis is
that it allows the data to speak for itself, since it is a
univariate procedure and therefore does not allow for
explanatory variables. The underlying, more restrictive
assumption throughout all Box-Jenkins procedures is that the
time series will eventually repeat itself [Ref. 7], or that
there is some pattern underlying the data.


³ For most software packages applied to Box-Jenkins
modeling, the estimation of parameters is done automatically
in the program, leaving the researcher free to concentrate
on analysis of the resulting statistics.


B. WHAT IS A TIME SERIES 


A time series is a collection of observations generated
sequentially over time at specific intervals such as hours,
days, weeks, months or years. In addition, a certain
dependence is supposed from one period to the next. It is
this interdependence that is of value when trying to
forecast future activity for a time series. Examples of time
series abound in fields ranging from business to physics and
are applied to analyze monthly sales for a company,
quarterly yields on fiduciary notes or the chemical yield of
a certain substance in a controlled procedure. A time series
can also be used to analyze observations that are either
discrete or continuous; by way of example, a discrete time
series would be the closing stock price of a company and a
continuous time series would be the temperature at the
weather center. In summary, a discrete time series is one
where observations exist at a point in time and a continuous
time series has potential observations at all points in
time. For purposes of this discussion, only discrete time
series are addressed.


С. STATIONARITY 


In pursuing a time series model, the first assumption to
be made in the analysis is that the data set is stationary.
By this it is meant that the observations oscillate around a
constant mean that shows no growth over time. Deviations
about this mean are temporary and in the long run display
equilibrium about the mean [Ref. 4].

Figure B.1 THE BOX-JENKINS METHODOLOGY.

A further measure of stationarity can be gained from the
autocorrelation of the time series, that is, the correlation
between successive observations from the same data set. An
observation at time t, denoted by $Z_t$, when correlated
with an observation $Z_{t+1}$ from the same data set (or,
more generally, $Z_{t+k}$ at lag k) is said to produce an
autocorrelation. The autocorrelation is measured by
$\rho_k$ and provides important information about the nature
of the data set. A value close to +1 indicates a high degree
of positive correlation between observations, while a value
close to -1 indicates a high negative correlation.

Most time series are not stationary and require some
type of transformation prior to analysis.
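
As a hedged illustration of the autocorrelation just described, the sketch below computes the usual sample estimate at lag k. The function name `autocorrelation` is hypothetical and the estimator is the common textbook form, not necessarily the one used by the software in this thesis.

```python
def autocorrelation(series, k):
    """Sample autocorrelation of a series at lag k.

    Deviations from the overall mean are multiplied k periods
    apart and scaled by the total sum of squared deviations.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
```

For the short trending series 1, 2, 3, 4, 5 the lag 1 value is 0.4 and the lag 2 value is -0.1. In longer trending series the autocorrelations typically die out slowly, which is one informal sign of non-stationarity.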


D. TIME SERIES PLOT


The first step in determining whether a time series is
stationary or not is to construct a time series plot of the
data, which plots observations against time in an attempt to
visually determine any obvious patterns in the data. Fig.
B.2 illustrates the United States gross national product for
the years 1947 through 1970 on a quarterly basis. This plot
shows a clear upward trend in the data, which indicates the
data is not stationary. Further, within each year there are
apparently recurring patterns for each quarter that repeat
on an annual basis. Finally, as time passes, the variance in
GNP tends to become larger and more volatile. Clearly, this
data set must be transformed prior to further analysis by
the Box-Jenkins method.


E. DATA TRANSFORMATION


To continue with the example of GNP, there are several
possible transformations that can be used to induce
stationarity. The first step is to induce a constant
variance in the data; this can be accomplished either
through a log or a square root transformation. Fig. B.3
illustrates the results of a square root transform of the
data. The trend is still clearly present but the variance
has been smoothed considerably.

Figure B.2 UNITED STATES GNP 1947 - 1970.


Once variance has been stabilized, the next step is to
remove the trend. There are several sophisticated regression
techniques available to accomplish this; however, the
method of differencing will be the only one addressed here.
For a more detailed discussion of these alternative
techniques, the reader is directed to Makridakis and
Wheelwright [Ref. 4].

Figure B.3 SQUARE ROOT TRANSFORMATION OF U.S. GNP 1947 - 1970.

The method of differencing a time series consists of
subtracting the values of the time series from each other in
a specified order. By way of example, consider the following
data set: 1,3,5,7,9,11. When plotted, this set has an
obvious linear trend. A first order difference, involving
subtraction of the first observation from the second, the
second from the third and so on, results in the transformed
series 2,2,2,2,2. By taking the first difference, the
trend disappears and yields a stationary data set. This also
indicates that whenever a data set is differenced, one
observation is lost for each difference operator. Due to
random fluctuations in real data, such clear cut results as
those illustrated should not be anticipated; however, for
the majority of data sets differencing will induce a
sufficient amount of stationarity to proceed with further
analysis.
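
The differencing example above can be reproduced with a short sketch. The helper name `difference` and its `lag` parameter are hypothetical; a lag of 1 gives the first difference, while a lag of 12 would give a seasonal difference for monthly data.

```python
def difference(series, lag=1):
    """Subtract from each observation the value `lag` periods earlier.

    The result is shorter than the input by `lag` observations,
    which is the loss of data points noted in the text.
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

trend = [1, 3, 5, 7, 9, 11]
first_diff = difference(trend)   # [2, 2, 2, 2, 2]
```

Six observations go in and five come out, and the linear trend is replaced by a constant series.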
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F. AUTOCORRELATION AND PARTIAL AUTOCORRELATION 


A useful tool in model estimation is the autocorrelation
function (ACF), which can be defined as the association or
mutual dependence between values of the same variable but at
different time periods. These ACF coefficients provide
valuable information about a data set and any pattern that
may be present. If, for example, a high positive coefficient
appeared every twelve months, a seasonal pattern may be
considered to exist.

The partial autocorrelation function (PACF) is another
and complementary measure to be applied along with the ACF
to aid in determining model type. PACF's are analogous to
ACF's in that they indicate the relationship of the values
of a time series to various time lagged values of the same
series. They differ from ACF's, however, in that they are
computed for each time lag after removing the effect of all
other time lags. In essence, they show the relative strength
of the relationship that exists for varying time lags.
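
One standard way to obtain partial autocorrelations from the ACF is the Durbin-Levinson recursion, sketched below. This is offered as an illustration only; the text does not say which algorithm the thesis software used, and the function name `pacf_from_acf` is hypothetical.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations from autocorrelations.

    rho[k - 1] is the autocorrelation at lag k (k = 1, 2, ...).
    Returns a list whose entry k - 1 is the partial
    autocorrelation at lag k (Durbin-Levinson recursion).
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1} ... phi_{k-1,k-1}
    for k in range(1, len(rho) + 1):
        if k == 1:
            phi_kk = rho[0]
            cur = [phi_kk]
        else:
            num = rho[k - 1] - sum(prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf
```

For an ACF that decays geometrically (0.5, 0.25, 0.125), the recursion returns 0.5 at lag 1 and zero afterwards, the single-spike PACF pattern associated with a first-order autoregressive process.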

When the ACF and the PACF are analyzed together, they
provide a very powerful tool for initial model selection.
Fig. B.4 through Fig. B.6 summarize the general shapes
associated with the different types of models.


G. THE AUTOREGRESSIVE MODEL 


A time series is said to be governed by an autoregressive
(AR) process if the current value of the time series can be
expressed as a linear function of the previous value or
values plus some error term or random shock value [Ref. 8].
The assumptions made here are that the data set is
stationary and the error terms are normally and
independently distributed with a mean of zero and constant
variance. A check on the adequacy of the model is to
construct an ACF for the residuals of the model and
determine if they are random in nature. Mathematically, an
AR (1) model is of the form:

$Z_t = \phi_1 Z_{t-1} + a_t$ (eqn B.1)

where $\phi_1$ is equal to the autoregressive coefficient
and $a_t$ is the random error or shock term.
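
A minimal sketch of the AR(1) process follows: it simulates a series with a known coefficient and then recovers the coefficient by least squares. The names `simulate_ar1` and `estimate_ar1`, the seed and the unit-variance normal shocks are all illustrative assumptions. Box-Jenkins software generally estimates parameters with more refined procedures.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Generate n values where each equals phi times the previous
    value plus a standard normal shock."""
    rng = random.Random(seed)
    z, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        z.append(prev)
    return z

def estimate_ar1(z):
    """Least-squares estimate of the AR(1) coefficient
    for a series with mean near zero."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(v * v for v in z[:-1])
    return num / den

series = simulate_ar1(0.6, 5000)
phi_hat = estimate_ar1(series)   # should land close to 0.6
```

With only about 36 monthly observations, as in this study, the estimate would be far less precise than with the 5000 simulated points used here.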


H. THE MOVING AVERAGE MODEL 


A time series is said to be governed by a moving average
(MA) process if the current value of the time series can be
expressed as a linear function of the current error term and
previous error term(s). The same restrictions apply to the
error terms of an MA model as applied to the AR model.
Mathematically, this function can be expressed as:

$Z_t = a_t - \theta_1 a_{t-1}$ (eqn B.2)

where $\theta_1$ is the moving average coefficient and
$a_t$ is again the error term.
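
The behavior of a first-order moving average process can be sketched as below. The sketch assumes the common Box-Jenkins sign convention, in which the current value equals the current shock minus theta times the previous shock. Under that assumption the theoretical autocorrelation is $-\theta_1/(1+\theta_1^2)$ at lag 1 and zero at all higher lags. All function names are hypothetical.

```python
import random

def simulate_ma1(theta, n, seed=0):
    """Simulate n values of Z_t = a_t - theta * a_{t-1}
    with standard normal shocks a_t."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [shocks[t] - theta * shocks[t - 1] for t in range(1, n + 1)]

def ma1_acf(theta, k):
    """Theoretical autocorrelation of the MA(1) process at lag k."""
    if k == 0:
        return 1.0
    if k == 1:
        return -theta / (1.0 + theta ** 2)
    return 0.0  # the ACF cuts off after lag 1

def sample_acf(series, k):
    """Sample autocorrelation at lag k."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    return sum(dev[t] * dev[t + k] for t in range(n - k)) / sum(d * d for d in dev)
```

Comparing `sample_acf` on a long simulated series with `ma1_acf` shows the single spike at lag 1 followed by values near zero, the ACF signature used to identify an MA(1) model.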


I. THE MIXED AUTOREGRESSIVE MOVING AVERAGE MODEL


The mixed autoregressive moving average model contains
elements of both the AR and the MA procedures and expresses
the current observation as a linear function of both past
values and past errors of the variable. As with the AR and
MA models, the residuals of the model are evaluated for
adequacy by utilizing the ACF. The equation for the ARMA
model is:

$Z_t = \phi_1 Z_{t-1} + a_t - \theta_1 a_{t-1}$ (eqn B.3)

Figure B.4 TYPICAL FORM OF AR1 MODEL ACF AND PACF.


J. EVALUATING THE MODEL 


Once a model has been selected, there are several ways
to check the results for adequacy. For purposes of this
discussion, the following checks will be addressed:

1. ACF of residuals

2. Minimum sum of squares

3. T-ratio

There are several other checks for adequacy that are
available to the user of Box-Jenkins methodology; for a more
comprehensive explanation the reader is directed to Vandaele
[Ref. 8].

Figure B.5 TYPICAL FORM OF MA1 MODEL ACF AND PACF.

1. ACF of Residuals

As mentioned throughout this discussion, the ACF of the
residuals for a model should be random about a mean of zero,
with constant variance and a magnitude less than two
standard errors.
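
The two standard error rule can be sketched as a simple screening function. The sketch assumes the large-sample approximation that the standard error of a residual autocorrelation is about one over the square root of the number of observations. The function name `residual_acf_flags` is hypothetical.

```python
from math import sqrt

def residual_acf_flags(residuals, max_lag):
    """Return (lag, autocorrelation) pairs whose magnitude exceeds
    two approximate standard errors, 2 / sqrt(n)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [v - mean for v in residuals]
    denom = sum(d * d for d in dev)
    bound = 2.0 / sqrt(n)
    flagged = []
    for k in range(1, max_lag + 1):
        r = sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        if abs(r) > bound:
            flagged.append((k, r))
    return flagged
```

An empty result is consistent with random residuals. Any flagged lag suggests structure that the fitted model has not captured.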


2. Minimum Sum of Squares

Determination of this measure can only be achieved
by comparison with other potential models. In some cases,
several models may produce insignificantly different sums of
squares, which will test the user's judgment and application
of the other measures of adequacy.

3. T-ratio

In time series analysis this is computed by dividing
the estimate of the parameter for the model by the standard
deviation for the series. The rules as applied to regression
analysis still hold in that the value should be greater
than 2.0 in absolute terms in order to indicate that the
coefficient is significantly different from zero.



К. PARSIMONY 


In the event that more than one model is capable of
satisfying the acceptance criteria described above, the
principle of parsimony will then apply. This states that
when faced with several sufficient model types, one should
select the lowest order model available that satisfies the
criteria.


L. TRANSFER FUNCTION MODELS 


Transfer function models are also known as multivariate
autoregressive integrated moving average (MARIMA) or leading
indicator models. The procedure involves selection of an
appropriate univariate model for what is to be the
independent variable and applying it to the dependent
variable. The application of the model will result in two
sets of residuals which, when cross correlated at different
time lags, will yield the cross correlation function (CCF).
This differs slightly from the ACF discussed earlier in that
we now expect to find a relationship of significant
magnitude at various points in the comparison. This positive
correlation is an indication that there is a significant
relationship between the independent and the dependent
variable at certain lags or leads in time.
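
The prewhitening and cross correlation steps can be sketched as follows. The sketch assumes, purely for illustration, that an AR(1) filter is an adequate univariate model for the independent variable; the model orders used in practice would come from the identification steps described earlier. All function names are hypothetical.

```python
from math import sqrt

def ar1_coefficient(z):
    """Least-squares AR(1) coefficient after removing the mean."""
    m = sum(z) / len(z)
    d = [v - m for v in z]
    return sum(d[t] * d[t - 1] for t in range(1, len(d))) / sum(v * v for v in d[:-1])

def prewhiten(series, phi):
    """Apply the AR(1) filter: series[t] - phi * series[t - 1]."""
    return [series[t] - phi * series[t - 1] for t in range(1, len(series))]

def cross_correlation(x, y, lead):
    """Correlation between x[t] and y[t + lead] for lead >= 0,
    i.e. how strongly x leads y by `lead` periods."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    cov = sum((x[t] - mx) * (y[t + lead] - my) for t in range(n - lead))
    return cov / (sx * sy)
```

In use, the coefficient is estimated from the independent series, the same filter is applied to both series, and `cross_correlation` is evaluated over a range of leads; a pronounced value at a given lead marks a candidate lag for the regression.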
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Figure B.6 TYPICAL FORM OF ARMA 1,1 MODEL ACF AND 
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SUEMARY OF BOX-JENKINS MODELS USED FOR ANALYSIS 


A. MODELS USED IN UNIVARIATE ANALYSIS 


1. Yeoman

Model Type - ARIMA (1,0,0)

Model Equation

$Z_t = .388 Z_{t-1} + a_t$ (eqn C.1)

T-ratio - 2.64

Model Residuals - Random

2. Storekeepers

Model Type - ARIMA (0,1,1)

Model Equation

$Z_t = -.814 a_{t-1} + a_t$ (eqn C.2)

T-ratio - 7.24

Model Residuals - Random

3. Operations Specialists

Model Type - ARIMA (1,0,0)

Model Equation

$Z_t = .499 Z_{t-1} + a_t$ (eqn C.3)

T-ratio - [illegible]

Model Residuals - Random

4. Electronics Technicians

Model Type - ARIMA (1,0,0)

Model Equation

$Z_t = .588 Z_{t-1} + a_t$ (eqn C.4)

T-ratio - 4.18

Model Residuals - Random

5. Boiler Technicians

Model Type - ARIMA (1,0,0)

Model Equation

$Z_t = .581 Z_{t-1} + a_t$ (eqn C.5)

T-ratio - 3.63

Model Residuals - Random


B. BOX-JENKINS MODELS OF REGRESSION RESIDUALS 


1. Yeoman

Model Type - ARIMA (0,0,1)

Model Equation

[equation illegible] (eqn C.6)

T-ratio - -2.19

Model Residuals - Random

2. Storekeepers

Model Type - ARIMA [order illegible]

Model Equation

[equation partly illegible; leading term .498 $Z_{t-1}$] (eqn C.7)

T-ratio - -3.00

Model Residuals - Random

3. Operations Specialists

Model Type - ARIMA (1,1,1)

Model Equation

[equation partly illegible; leading term .608 $Z_{t-1}$] (eqn C.8)

T-ratio - AR1 - 3.28

Model Residuals - Random

4. Electronics Technicians

Model Type - ARIMA (1,0,1)

Model Equation

[equation illegible] (eqn C.9)

T-ratio - AR1 - 5.21
          MA1 - 2.04

Model Residuals - Random

5. Boiler Technicians

Model Type - ARIMA (1,0,0)

Model Equation

$Z_t = .290 Z_{t-1} + a_t$ (eqn C.10)

T-ratio - 3.70

Model Residuals - Random
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